ABSTRACT


This report discusses methods for predicting future values of discrete
time series from past observed values of the time series. The points at
which the autocorrelation function is computed are the past points used
in the analysis. Classical ideas are reviewed, and then extended to
handle more advanced time series problems. Applications of these results
to seakeeping are explained for: (a) long range ocean activity prediction,
(b) short term roll prediction, and (c) vibration response prediction.
Further material appears on statistical tests for coefficient
determination, and on digital computer requirements.




1.  INTRODUCTION 


In  the  subsequent  discussion  it  will  be  assumed  that  a  discrete  random 
process,  namely,  a  digitized  process  is  being  analyzed.  The  process  will  be 
assumed  to  be  a  time  series  x(t),  which  is  defined  as  a  set  of  observations 
taken  in  time  sequence.  Much  physical  data  is  obtained  in  this  way  either 
directly  or  from  digitizing  the  original  continuous  data.  The  procedure  to 
be  considered  will  be  that  of  using  statistical  multiple  regression  techniques 
in order to perform a linear least squares extrapolation for the future,
employing the previously observed values of the time series.

The  past  observed  values  of  the  variable  at  certain  special  points  will 
be  thought  of  as  separate  variables.  More  particularly,  the  points  at  which 
the  autocorrelation  function  is  computed  will  be  the  special  points  in  the 
regression equation. For example, if the autocorrelation function is computed
at the time delays $\tau$ = 1 second, 2 seconds, etc., up to n seconds, then the
variables will be the values of the time series x(t) observed 1 second,
2 seconds, etc., up to n seconds in the past, that is, x(-1), x(-2), ..., x(-n).
In general, there will be n variables
present in the regression equation. The final objective will be to obtain a
regression  equation  as  a  function  of  previous  values  of  the  time  series  so  as  to 
predict  an  extrapolated  value  of  the  time  series  for  some  time  in  the  future. 
This  general  type  of  analysis  is  sometimes  termed  "autoregressive"  analysis. 

These  matters  are  discussed  in  detail  in  the  following  sections,  and 
are  illustrated  on  appropriate  physical  applications.  Sections  2  and  3  review 
classical  ideas  on  multiple  regression  techniques,  while  Sections  4  and  5 
are devoted to more advanced time series concepts. Applications of these
results, particularly to seakeeping problems, are developed in Section 6.
Further  material  on  statistical  tests  and  on  digital  computer  requirements 
appears  in  Sections  7  and  8. 

2. REGRESSION FOR A SINGLE PREDICTOR VARIABLE

As an introduction to the basic concept of regression, the case for a
single predictor variable x will be reviewed (Ref. [1]). Assume a random
variable y exists, with mean zero, and also assume that y may be
theoretically expressed as a linear function of a variable x plus a random
error $\varepsilon$. That is,

$$y = \beta x + \varepsilon \tag{1}$$

where $\beta$ is an unknown constant to be determined. The usual estimation
procedure is to calculate an estimate of $\beta$ by the method of least squares.
Assume one has collected N observations each of x and y, denoted by
$(x_i, y_i)$, $i = 1, 2, \ldots, N$. One then wants to minimize the quantity

$$\sum_{i=1}^{N} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{N} (y_i - b x_i)^2 \tag{2}$$

where the quantity $\beta$ is replaced by b in Eq. (2), and $\hat{y} = bx$, to indicate that
a sample estimate b of $\beta$ will be obtained rather than the true population
value $\beta$. The hat (^) is used over the y to indicate that $\hat{y}$ is a predicted or
estimated value of y.

Differentiating with respect to b and equating to zero, one obtains

$$b = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2} = \frac{r_{xy} s_x s_y}{s_x^2} = r_{xy} \frac{s_y}{s_x} \tag{3}$$

where $r_{xy}$ is the sample correlation coefficient given by

$$r_{xy} = \frac{\sum_{i=1}^{N} x_i y_i / N}{s_x s_y} \tag{4}$$

The value of b given by Eq. (3) is the optimum least squares estimate of $\beta$.

The sample variances $s_x^2$ and $s_y^2$ in Eqs. (3) and (4) are defined by

$$s_x^2 = \frac{\sum_{i=1}^{N} x_i^2}{N} \tag{5}$$

with $s_y^2$ defined analogously. Then $s_x$, the sample standard deviation, is
the positive square root of $s_x^2$. It should be noted that $s_x^2$ is a biased
estimator of the true variance $\sigma_x^2$. For large N (N > 30), however, the
bias is insignificant.

The variance of b may be shown to be

$$\operatorname{Var}(b) = \frac{\sigma_y^2}{\sum_{i=1}^{N} x_i^2} \tag{6}$$


Under the assumptions of an underlying normal distribution, the quantity

$$u = \frac{b - \beta}{\sqrt{\sigma_y^2 / \sum_{i=1}^{N} x_i^2}}$$

has a normal distribution with zero mean and unit variance. Also, it may be
shown (see Ref. [3]) that the quantity

$$v = \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sigma_y^2}$$

has a $\chi^2$ distribution with (N - 2) d.f. Therefore, this implies that the
statistic

$$t = \frac{u}{\sqrt{v/(N-2)}} = (b - \beta) \sqrt{\frac{(N-2) \sum_{i=1}^{N} x_i^2}{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}} \tag{7}$$

has a t distribution with (N - 2) d.f. Hence, the hypothesis that b = $\beta$ may
be tested at a desired level of significance $\alpha$ for a specified value of $\beta$.
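
The single-predictor formulas can be illustrated with a minimal Python sketch. It assumes zero-mean data and no intercept term, as in Eq. (1); the function names and the sample data are hypothetical and are not part of the report. It evaluates the slope of Eq. (3), the correlation coefficient of Eq. (4) with the divisor-N variances of Eq. (5), and the statistic of Eq. (7).

```python
import math

def fit_single_predictor(x, y):
    """Least squares slope b for y = b*x (no intercept), per Eq. (3)."""
    n = len(x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    syy = sum(yi * yi for yi in y)
    b = sxy / sxx
    s_x = math.sqrt(sxx / n)          # Eq. (5), divisor N
    s_y = math.sqrt(syy / n)
    r_xy = (sxy / n) / (s_x * s_y)    # Eq. (4)
    return b, r_xy, s_x, s_y

def t_statistic(x, y, b, beta):
    """Eq. (7): t = (b - beta) * sqrt((N - 2) * sum(x^2) / sum((y - b*x)^2))."""
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    sse = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    return (b - beta) * math.sqrt((n - 2) * sxx / sse)

# Hypothetical zero-mean observations scattered about y = 2x.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.1, 2.0, 3.9]
b, r_xy, s_x, s_y = fit_single_predictor(x, y)
t_for_beta_2 = t_statistic(x, y, b, 2.0)   # test the hypothesis beta = 2
```

The value `t_for_beta_2` would then be compared with a tabulated t value for (N - 2) d.f. at the chosen significance level.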


3.  REGRESSION  FOR  TWO  AND  MORE  PREDICTOR  VARIABLES 

The  case  for  one  predictor  variable  is  easily  generalized.  This  will  be 
illustrated  first  for  the  two  variable  case  and  then  for  n  variables. 

For two variables one desires to compute the coefficients $b_1$ and $b_2$ in
the regression equation

$$\hat{y} = b_1 x_1 + b_2 x_2 \tag{8}$$

The sum of squares to be minimized for this case is

$$\sum_{i=1}^{N} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{N} (y_i - b_1 x_{1i} - b_2 x_{2i})^2 \tag{9}$$

Differentiating partially with respect to $b_1$ and $b_2$, one obtains two
simultaneous linear equations which may be solved for $b_1$ and $b_2$. These are

$$\begin{aligned}
b_1 \sum_{i=1}^{N} x_{1i}^2 + b_2 \sum_{i=1}^{N} x_{1i} x_{2i} &= \sum_{i=1}^{N} y_i x_{1i} \\
b_1 \sum_{i=1}^{N} x_{1i} x_{2i} + b_2 \sum_{i=1}^{N} x_{2i}^2 &= \sum_{i=1}^{N} y_i x_{2i}
\end{aligned} \tag{10}$$

The coefficients $b_1$ and $b_2$ are usually referred to as the sample partial
regression coefficients.

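The generalization to k variables $x_i$, i = 1, ..., k follows directly.
In this case one wants to determine coefficients $b_i$, i = 1, ..., k to obtain
a regression equation

$$\hat{y} = b_1 x_1 + b_2 x_2 + \cdots + b_k x_k \tag{11}$$

The sum of squares to be minimized becomes

$$\sum_{i} (y_i - \hat{y}_i)^2 = \sum_{i} (y_i - b_1 x_{1i} - b_2 x_{2i} - \cdots - b_k x_{ki})^2 \tag{12}$$

The result after minimizing the sum of squares is k simultaneous equations
which may be solved for the $b_i$, namely,

$$\begin{aligned}
b_1 \sum_{i} x_{1i}^2 + b_2 \sum_{i} x_{1i} x_{2i} + \cdots + b_k \sum_{i} x_{1i} x_{ki} &= \sum_{i} y_i x_{1i} \\
b_1 \sum_{i} x_{1i} x_{2i} + b_2 \sum_{i} x_{2i}^2 + \cdots + b_k \sum_{i} x_{2i} x_{ki} &= \sum_{i} y_i x_{2i} \\
&\;\;\vdots \\
b_1 \sum_{i} x_{1i} x_{ki} + b_2 \sum_{i} x_{2i} x_{ki} + \cdots + b_k \sum_{i} x_{ki}^2 &= \sum_{i} y_i x_{ki}
\end{aligned} \tag{13}$$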
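
As a hedged illustration of how the simultaneous equations of Eq. (13) might be formed and solved, the following Python sketch builds the sums of cross products and solves them by Gaussian elimination with partial pivoting. The elimination routine, the function names, and the data are assumptions chosen for the example; the report does not prescribe a particular solution method.

```python
def solve_linear(a, rhs):
    """Solve a*b = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    b = [0.0] * n
    for r in range(n - 1, -1, -1):                 # back substitution
        s = sum(m[r][c] * b[c] for c in range(r + 1, n))
        b[r] = (m[r][n] - s) / m[r][r]
    return b

def normal_equations(xs, y):
    """xs[j] holds the observations of predictor j+1.

    Returns the matrix of sums of cross products and the right-hand
    side that appear in Eq. (13)."""
    k = len(xs)
    a = [[sum(p * q for p, q in zip(xs[r], xs[c])) for c in range(k)]
         for r in range(k)]
    rhs = [sum(yi * p for yi, p in zip(y, xs[r])) for r in range(k)]
    return a, rhs

# Hypothetical noise-free data generated from y = 2*x1 - 3*x2.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, 1.0, -1.0]
y = [2.0 * p - 3.0 * q for p, q in zip(x1, x2)]
a, rhs = normal_equations([x1, x2], y)
b = solve_linear(a, rhs)   # approximately [2.0, -3.0]
```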

One should be careful to contrast this model with a correlation analysis.
Here it is assumed that y is a random variable of which readings are taken
as the $x_i$ are varied through predetermined values. In this case the $x_i$
are not assumed to be arbitrary random variables. In contrast, for a
classical statistical correlation analysis, one usually assumes a joint (k+1)
dimensional normal distribution where the y variable and the $x_i$ variables,
i = 1, ..., k, are all assumed to be random. That is, one has not controlled
the variables $x_i$ but has performed some experiment where all the variables
have random outcomes. In this case one wants to estimate all the correlations
between the variables. In other words, one wants to estimate the
correlation or covariance matrix of the (k+1) dimensional normal distribution.

To explain further, in a true correlation analysis one collects observations
on all the variables of interest where none of the variables are controlled
but only observed. However, in a true regression analysis one would control
one variable, say for example pressure, and then observe another, say
temperature, as pressure was stepped through a predetermined range of
values. In this case, only the temperature variable would exhibit random
fluctuations. In practice, however, one is not usually able to control variables
through predetermined ranges as is theoretically required. This gives rise to
no real practical problems, though, since if underlying normal distributions
are assumed, the computational procedures for obtaining correlation
coefficients are the same as the initial computations required for obtaining the
coefficients of the regression equation.


4.  APPLICATION  TO  TIME  SERIES 

Consider now the situation where one has collected N + k observations
from a stationary random process as a function of time which has a zero
mean value. Suppose that these are equally spaced observations and are
denoted by $x_1, x_2, \ldots, x_{N+k}$, where $x_1$ is the first observed value and
$x_{N+k}$ is the last. These observations could be obtained by reading the values
from an analog record or they might naturally arise at discrete points as in
certain digital processes. In order to perform an autocorrelation analysis
on the data, compute the sample autocorrelation function at k points as
defined by the formula

$$R_{xx}(i) = \frac{1}{N} \sum_{j=1}^{N} x_j x_{j+i}, \qquad i = 0, 1, \ldots, k \tag{14}$$


It  will  be  assumed  that  k  is  much  less  than  N,  that  is,  there  are  many 
more  observations  available  than  points  at  which  the  autocorrelation  function 
is  computed. 

A more convenient quantity to work with is the normalized autocorrelation
function. This is defined by dividing $R_{xx}(i)$ by $R_{xx}(0)$. In equation form
one has

$$r_{xx}(i) = \frac{R_{xx}(i)}{R_{xx}(0)}, \qquad i = 0, 1, \ldots, k \tag{15}$$

These quantities, $r_{xx}(i)$, are called correlation coefficients and it may be
easily proved that they are bounded in absolute value by unity. That is,

$$-1 \le r_{xx}(i) \le 1, \qquad i = 0, 1, \ldots, k \tag{16}$$
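
The sample autocorrelation of Eq. (14) and its normalized form of Eq. (15) can be sketched in a few lines of Python. The function names and the alternating test series are hypothetical; the record is assumed to hold N + k zero-mean observations so that every lag is based on N products.

```python
def autocorrelation(x, k):
    """R_xx(i) for i = 0..k per Eq. (14); x holds N + k observations."""
    n = len(x) - k
    if n <= 0:
        raise ValueError("need more than k observations")
    return [sum(x[j] * x[j + i] for j in range(n)) / n for i in range(k + 1)]

def normalized(r):
    """r_xx(i) = R_xx(i) / R_xx(0), per Eq. (15)."""
    return [v / r[0] for v in r]

# Hypothetical record: N + k = 8 observations, k = 2, so N = 6.
x = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
R = autocorrelation(x, 2)
r = normalized(R)   # [1.0, -1.0, 1.0] for this alternating series
```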

After the correlation function has been computed, a regression analysis
may  follow.  The  object  of  the  regression  analysis  would  be  to  obtain  a  linear 
equation  which  would  be  used  to  predict  ahead  (extrapolate)  in  the  time  series. 
This  is  a  reasonable  objective  since  if  there  are  high  correlations  indicated 
by  very  high  peaks  in  the  correlation  function  at  certain  time  delays,  then 
this  implies  some  prediction  can  be  made  this  far  ahead  in  the  time  series. 
For  example,  if  there  is  a  peak  in  the  correlation  function  at  i  =  10  seconds, 
then  as  the  series  is  being  observed  and  data  is  being  collected  in  real  time, 
say,  one  would  expect  to  be  able  to  predict  ahead  approximately  10  seconds 
with  much  greater  accuracy  than  at  other  times. 

Now it is desired to calculate coefficients $b_i$, i = 1, 2, ..., k for a
regression equation of the form of Eq. (11), namely,

$$\hat{x}_0 = b_1 x_1 + b_2 x_2 + \cdots + b_k x_k \tag{17}$$

In this case, the variables $x_1, x_2, \ldots, x_k$ all represent observations from
the given time series as opposed to being different variables. To be specific,
$x_1$ = the present observed value of the time series, $x_2$ = the value observed
one time unit in the past, and so on up to $x_k$, which represents the
observation (k - 1) time units in the past. The variable $x_0$ is displaced one time
unit from $x_1$ and therefore represents a value of the time series one time
unit into the future. This is, of course, the prediction to be made.

To be more precise in notation, Eq. (17) should be written with
subscripts as follows.

$$\hat{x}_{t+1} = b_1 x_t + b_2 x_{t-1} + \cdots + b_k x_{t-k+1} \tag{17a}$$

In words, all variables are translated with respect to t. However, avoiding
the more complicated subscripts simplifies notation and should cause no
confusion.

4.1 COMPUTATIONAL DETAILS


The coefficients, $b_i$, will be found from solving a set of simultaneous
linear equations similar to those indicated by Eq. (13) using the values of
the autocorrelation function which have already been computed. The set of
simultaneous linear equations may be rewritten in a slightly different form
as follows.

$$\begin{aligned}
b_1 R(0) + b_2 R(1) + \cdots + b_k R(k-1) &= R(1) \\
b_1 R(1) + b_2 R(0) + \cdots + b_k R(k-2) &= R(2) \\
&\;\;\vdots \\
b_1 R(k-1) + b_2 R(k-2) + \cdots + b_k R(0) &= R(k)
\end{aligned} \tag{18}$$

In the above equations $R(i) = R_{xx}(i)$ to simplify notation.
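
A possible way to solve the system of Eq. (18) is sketched below in Python. The report itself speaks only of matrix inversion; the Levinson-Durbin recursion used here is a substitute technique that exploits the fact that the coefficient matrix depends only on the lag difference. It assumes a positive definite correlation sequence so that the running error term never reaches zero. The function names and the numerical values are hypothetical.

```python
def levinson_durbin(R, k):
    """Solve Eq. (18) for b_1..b_k given R[0..k] by order recursion."""
    b = []
    err = R[0]                      # prediction error variance at order 0
    for m in range(1, k + 1):
        # Part of R(m) not explained by the order m-1 predictor.
        acc = R[m] - sum(b[j] * R[m - 1 - j] for j in range(m - 1))
        refl = acc / err
        # Update the existing coefficients and append the new one.
        b = [b[j] - refl * b[m - 2 - j] for j in range(m - 1)] + [refl]
        err *= (1.0 - refl * refl)
    return b

def predict_next(b, history):
    """Eq. (17a): history[-1] is x_t, history[-2] is x_(t-1), and so on."""
    return sum(b[j] * history[-1 - j] for j in range(len(b)))

# Hypothetical correlation values R(i) = 0.5**i; only the most recent
# observation should then carry weight.
R_ar1 = [1.0, 0.5, 0.25, 0.125]
b_ar1 = levinson_durbin(R_ar1, 3)
x_hat = predict_next(b_ar1, [4.0, 2.0, 8.0])

# A second hypothetical case, checked by substitution into Eq. (18).
R_gen = [2.0, 1.0, 0.3]
b_gen = levinson_durbin(R_gen, 2)
```

The recursion needs on the order of k squared operations, which may ease the computational burden discussed in the text; whether that matters depends on k and on the machine.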

The solution of this set of k linear equations requires essentially the
inversion of a k by k matrix. If k is large, say on the order of 30 or 40,
which would not be at all unreasonable, the necessary matrix inversion would
be a considerable computational task, even on a digital computer. Therefore,
it would seem to be advisable to restrict one's attention to only those points
in the correlation function which exhibit a fairly high peak as determined by
some method. By confining one's attention to only those points in the
correlation function which are significantly different from zero, one can reduce
the order of the matrix to be inverted considerably. It would be most
desirable if all non-significant points in the correlation function could be
eliminated from consideration entirely. Unfortunately, this is not the case,
as will be illustrated in the derivation of the necessary least squares
equations below.

As an example, assume that significant peaks in the correlation function
occur at points 3, 4, 8, and 11. The variables which appear in the
regression equation are therefore $x_3$, $x_4$, $x_8$, and $x_{11}$. The coefficients
to be estimated from the data are denoted by $b_3$, $b_4$, $b_8$, and $b_{11}$
in Eq. (19) below.

$$\hat{x}_0 = b_3 x_3 + b_4 x_4 + b_8 x_8 + b_{11} x_{11} \tag{19}$$

The sum of squares to be minimized is

$$\sum_{i=1}^{N} (x_0 - \hat{x}_0)^2 = \sum_{i=1}^{N} (x_0 - b_3 x_3 - b_4 x_4 - b_8 x_8 - b_{11} x_{11})^2 \tag{20}$$

Differentiating with respect to the $b_i$ coefficients, and equating to zero, one
obtains the following set of simultaneous linear equations.

$$\begin{aligned}
b_3 \sum x_3^2 + b_4 \sum x_3 x_4 + b_8 \sum x_3 x_8 + b_{11} \sum x_3 x_{11} &= \sum x_3 x_0 \\
b_3 \sum x_3 x_4 + b_4 \sum x_4^2 + b_8 \sum x_4 x_8 + b_{11} \sum x_4 x_{11} &= \sum x_4 x_0 \\
b_3 \sum x_3 x_8 + b_4 \sum x_4 x_8 + b_8 \sum x_8^2 + b_{11} \sum x_8 x_{11} &= \sum x_8 x_0 \\
b_3 \sum x_3 x_{11} + b_4 \sum x_4 x_{11} + b_8 \sum x_8 x_{11} + b_{11} \sum x_{11}^2 &= \sum x_{11} x_0
\end{aligned} \tag{21}$$

In the above equations all summations are assumed to run from i = 1 to N.
Note that it is assumed that N + 11 observations are available so that all
points of the correlation function are based on N observations.

Rewriting the above equations in terms of the correlation function
values, one obtains the following set of simultaneous linear equations

$$\begin{aligned}
b_3 R(0) + b_4 R(1) + b_8 R(5) + b_{11} R(8) &= R(3) \\
b_3 R(1) + b_4 R(0) + b_8 R(4) + b_{11} R(7) &= R(4) \\
b_3 R(5) + b_4 R(4) + b_8 R(0) + b_{11} R(3) &= R(8) \\
b_3 R(8) + b_4 R(7) + b_8 R(3) + b_{11} R(0) &= R(11)
\end{aligned} \tag{22}$$

In the above equations only R(3), R(4), R(8), and R(11) are considered to be
significant points on the correlation function. However, in deriving the
least squares equations, the points R(0), which is of course the variance,
and R(1), R(5), and R(7) also enter into the equations. Nevertheless, the
necessary matrix to be inverted is now only of order 4, as opposed to order
k if all k points of the correlation function were employed for the prediction
equation.

The general set of equations, when one chooses some subset of the
points of the correlation function, is as follows. Suppose one decides upon
r separate points of the correlation function as being significant peaks.
Suppose further these are labeled $a_1, a_2, \ldots, a_r$. The points of interest
in the time series are, therefore, $x_{a_1}, x_{a_2}, \ldots, x_{a_r}$. The set of
simultaneous linear equations to be solved for the b coefficients now becomes

$$\begin{aligned}
b_{a_1} R(0) + b_{a_2} R(a_2 - a_1) + \cdots + b_{a_r} R(a_r - a_1) &= R(a_1) \\
b_{a_1} R(a_2 - a_1) + b_{a_2} R(0) + \cdots + b_{a_r} R(a_r - a_2) &= R(a_2) \\
&\;\;\vdots \\
b_{a_1} R(a_r - a_1) + b_{a_2} R(a_r - a_2) + \cdots + b_{a_r} R(0) &= R(a_r)
\end{aligned} \tag{23}$$
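
The pattern of lags in Eq. (23) can be tabulated mechanically. The short Python sketch below, with hypothetical function names, lists which lag of the correlation function appears in each position of the system for a chosen set of points, and which additional lags are needed beyond the chosen ones. It only builds the index pattern; the correlation values themselves would be filled in from Eq. (14).

```python
def subset_lag_system(lags):
    """Lag index of R in each matrix position and on the right-hand side
    of Eq. (23), for chosen points a_1..a_r."""
    matrix = [[abs(ai - aj) for aj in lags] for ai in lags]
    rhs = list(lags)
    return matrix, rhs

def extra_lags(lags):
    """Lags that enter the matrix although they were not chosen as peaks."""
    matrix, _ = subset_lag_system(lags)
    return sorted({v for row in matrix for v in row} - set(lags))

# The example from the text: peaks at points 3, 4, 8, and 11.
matrix, rhs = subset_lag_system([3, 4, 8, 11])
needed = extra_lags([3, 4, 8, 11])   # [0, 1, 5, 7]
```

For the points 3, 4, 8, and 11 this reproduces the lag pattern of the four-equation example and shows that R(0), R(1), R(5), and R(7) must also be available.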


No particular systematic method will be considered in this report
for choosing the values of the correlation function which are significant and
which, therefore, should be included in the regression equation. In many
cases of interest, a significant peak or peaks will be obvious in the
computed correlation function. In many other typical cases the correlation
function will exhibit a damped oscillatory behavior as in the widely observed
exponential-cosine autocorrelation function. In this case one should use the
first several peaks (positive and negative) as variables for the regression
equation.

4.2 COMPARISON WITH OTHER PROCEDURES


One may compare the preceding results to those of the "optimum Wiener
linear predictor." The approach here as a regression problem gives the
same results since both procedures are based on a least squares error
criterion. The optimum Wiener linear predictor is essentially given by
Eq. (17a) except that it is usually presented in continuous integral form:

$$\hat{x}(t) = \int_0^T x(t - \tau)\, b(\tau)\, d\tau \tag{24}$$

where the coefficients $b_i$ or "weighting function" $b(\tau)$ are obtained from
solving Eq. (18) or the equivalent.

Also, classical data smoothing procedures extended to an extrapolation
procedure would lead to the same results if an underlying linear trend
is  assumed.  In  this  case  there  is  a  difference  in  concept  involved  since 
one  never  considers  "noise"  extrapolation  or  prediction  when  thinking  of 
smoothing  data.  One  invariably  assumes  observations  are  composed  of  an 
underlying  signal,  e.g.,  an  nth  degree  polynomial,  and  additive  independent 
noise.  One  then  performs  a  "curve  fitting"  procedure  with  a  least  squares 
error  criterion  to  obtain  an  equation  of  the  form  of  Eq.  (17a)  in  the  discrete 
case  or  Eq.  (24)  in  the  continuous  case. 

5. EXTENSION TO CROSS-CORRELATED VARIABLES


The preceding regression techniques may be extended to predict one
variable as a function both of past observations of itself and of past
observations of another variable. The procedure for obtaining the
necessary coefficients of the regression equation will be directly analogous
to the preceding method, although now the cross-correlation between the
first variable x and the second variable y will be brought into play. The
necessary equations will be obtained below for the case where peaks in the
autocorrelation function and the cross-correlation function are taken into
account to reduce the amount of computation.

Suppose as before that the points of significance for the autocorrelation
function are at $a_1, a_2, \ldots, a_r$ for the variable x. For the cross-correlation
of x with y, assume the significant points of this function are designated
by $d_1, d_2, \ldots, d_s$. The points of the autocorrelation function of the
variable x will be denoted by $R_x(a_1), R_x(a_2), \ldots, R_x(a_r)$. The values of the
cross-correlation function between the variables x and y will be denoted
by $R_{xy}(d_1), R_{xy}(d_2), \ldots, R_{xy}(d_s)$. The least squares equations are obtained
in exactly the same way as was done previously. After minimizing the
appropriate sum of the squares, the simultaneous linear equations are as
shown in Eqs. (25). In these equations (25) the relation $R_{xy}(i) = R_{yx}(-i)$
has been employed. From this set of equations the coefficients of the
following regression equation are obtained:

$$\hat{x}_0 = b_{a_1} x(a_1) + \cdots + b_{a_r} x(a_r) + c_{d_1} y(d_1) + \cdots + c_{d_s} y(d_s) \tag{26}$$

To illustrate these equations more concretely, assume that peaks
occur at $R_x(1)$ and $R_x(3)$ in the autocorrelation function for x and at
$R_{xy}(2)$ and $R_{xy}(4)$ in the cross-correlation function between x and y.
For this special case four simultaneous linear equations are obtained, namely,

$$\begin{aligned}
b_1 R_x(0) + b_3 R_x(2) + c_2 R_{xy}(1) + c_4 R_{xy}(3) &= R_x(1) \\
b_1 R_x(2) + b_3 R_x(0) + c_2 R_{xy}(-1) + c_4 R_{xy}(1) &= R_x(3) \\
b_1 R_{xy}(1) + b_3 R_{xy}(-1) + c_2 R_y(0) + c_4 R_y(2) &= R_{xy}(2) \\
b_1 R_{xy}(3) + b_3 R_{xy}(1) + c_2 R_y(2) + c_4 R_y(0) &= R_{xy}(4)
\end{aligned} \tag{27}$$

From these equations the coefficients for the following prediction are
obtained:

$$\hat{x}_0 = b_1 x_1 + b_3 x_3 + c_2 y_2 + c_4 y_4 \tag{28}$$

Note that values of the autocorrelation function for y are required and that
also $R_{xy}(-1)$ is required or equivalently $R_{yx}(1)$.

It is clear that the above equations can easily be extended to account
for cross-correlations between more than two variables. However, the
case of two variables shown above adequately illustrates the general form
of the equations, so the general case will not be developed here.
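
The bookkeeping behind the cross-correlated case can be sketched in Python as follows. The sketch assumes the convention that $R_{pq}(\tau)$ is the average of p(t) q(t - tau), which is the convention that reproduces the lags of the four-equation example above; the function names and the label format are hypothetical. It produces, for any list of (series, lag) terms, the label of the correlation value needed in each position of the least squares equations. Autocorrelations are written Rxx and Ryy here, where the report writes $R_x$ and $R_y$.

```python
def corr_label(s1, l1, s2, l2):
    """Correlation value for the lagged product s1(t - l1) * s2(t - l2),
    with R_pq(tau) taken as the average of p(t) * q(t - tau)."""
    if s1 == s2:
        return f"R{s1}{s2}({abs(l2 - l1)})"   # autocorrelation is even in the lag
    if s1 > s2:                               # keep series names in alphabetical order
        s1, l1, s2, l2 = s2, l2, s1, l1
    return f"R{s1}{s2}({l2 - l1})"

def normal_equation_labels(target, terms):
    """terms is a list of (series name, lag) pairs used to predict target at lag 0."""
    matrix = [[corr_label(sr, lr, sc, lc) for (sc, lc) in terms]
              for (sr, lr) in terms]
    rhs = [corr_label(target, 0, s, l) for (s, l) in terms]
    return matrix, rhs

# Peaks at lags 1 and 3 for x itself and at lags 2 and 4 for y.
terms = [("x", 1), ("x", 3), ("y", 2), ("y", 4)]
matrix, rhs = normal_equation_labels("x", terms)
```

The same helper applies unchanged to three or more series, since each matrix entry depends only on the pair of terms involved.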

6. SEAKEEPING APPLICATIONS

Applications of the autoregressive techniques described on the preceding
pages are performed most directly for relatively low frequency and
quasi-periodic data. Four specific seakeeping applications will be discussed
here. The first three concern oceanographic data analysis, and the fourth
describes an application to the area of vibration and acoustical data analysis.
This application to vibration and acoustics is concerned not with direct
analysis of an individual vibration record, but with the broader problem
of predicting vibration properties from other data.

6.1 LONG RANGE OCEAN ACTIVITY PREDICTION

Many  applications  exist  in  this  general  area  which  can  be  associated 
with  the  weather.  When  viewed  as  a  time  series,  the  weather  is  somewhat 
periodic and of a relatively low frequency nature. In Reference [6, page 129],
Pierson suggests the possibility of applying the techniques described in this
report  for  long  range  weather  forecasting.  Similarly,  since  ocean  wave 
activity  is  influenced  by  the  weather,  these  techniques  could  be  valuable  in 
obtaining  a  long  or  short  range  forecaster  of  gross  ocean  wave  activity. 

For  example,  the  wind  has  considerable  influence  in  the  generation  of 
a  confused  sea  while  air  pressure  and  temperature  might  exhibit  some  more 
indirect  effects.  Also,  ocean  activity  in  one  geographical  area  might  lead 
to  activity  at  a  later  time  in  a  different  geographical  area. 

To  illustrate  this  matter,  assume  one  has  collected  time  histories 
x(t)  and  y(t)  of  some  parameter  of  ocean  wave  activity  at  two  different 
geographical  points,  and  a  time  history  of  the  wind,  z(t),  at  one  of  these 
points. The normalized autocorrelation function of x(t) and the normalized
cross-correlation functions between x(t) and y(t), and between x(t) and z(t),
might appear as pictured in Figure 1.

If the correlation functions were as shown in Figure 1, then the
autocorrelation function of x(t) itself would only be of use for very short term
predictions. However, since peaks occur in the two cross-correlation
functions at longer time delays, predictions could be made on the basis of
this information for relatively longer prediction times. If the time interval
for $\tau$ was one minute, then Figure 1 indicates that the prediction could be
made for a two-hour period in advance. The regression equation would be


$$\hat{x}_0 = c_{122} y_{122} + c_{123} y_{123} + d_{203} z_{203} + d_{204} z_{204} \tag{29}$$


The linear equations in matrix form to be solved to obtain the coefficients
in Eq. (29) are

$$\begin{bmatrix}
R_{yy}(0) & R_{yy}(1) & R_{yz}(81) & R_{yz}(82) \\
R_{yy}(1) & R_{yy}(0) & R_{yz}(80) & R_{yz}(81) \\
R_{yz}(81) & R_{yz}(80) & R_{zz}(0) & R_{zz}(1) \\
R_{yz}(82) & R_{yz}(81) & R_{zz}(1) & R_{zz}(0)
\end{bmatrix}
\begin{bmatrix} c_{122} \\ c_{123} \\ d_{203} \\ d_{204} \end{bmatrix}
=
\begin{bmatrix} R_{xy}(122) \\ R_{xy}(123) \\ R_{xz}(203) \\ R_{xz}(204) \end{bmatrix} \tag{30}$$

After  this  set  of  equations  has  been  solved,  then  Eq.  (29)  could  be  used  as 
a  predictor  of  ocean  wave  activity. 

6.2 SHORT TERM ROLL PREDICTION

The  possibility  of  obtaining  short  term  predictions  utilizing  information 
contained  in  the  autocorrelation  function  of  the  time  series  consisting  of 
certain ship motions is suggested by St. Denis and Pierson in Reference [7,
page 35]. The practical use of such a prediction would undoubtedly be in a
short term control system of some sort in a ship. Assume for example
that roll, denoted by x(t), is the motion being considered. As indicated in Ref. [7],
these various ship motions look like narrow band noise under an exciting force
of  a  random  sea  due  to  the  fact  that  the  ship  acts  as  a  narrow  band  filter. 

The  typical  autocorrelation  function  that  arises  from  narrow  band  noise  is 
a  damped  exponential-cosine.  Therefore,  experimental  data  might  give 
rise  to  an  autocorrelation  function  as  depicted  in  Figure  2. 


If $\tau$ is in seconds, then Figure 2 indicates that a one-second prediction
could be made by use of the regression equation

$$\hat{x}_0 = b_1 x_1 + b_3 x_3 + b_5 x_5 + b_7 x_7 + b_8 x_8 \tag{31}$$

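In order to obtain the coefficients in Eq. (31), the following set of five
linear equations must be solved.

$$\begin{bmatrix}
R_{xx}(0) & R_{xx}(2) & R_{xx}(4) & R_{xx}(6) & R_{xx}(7) \\
R_{xx}(2) & R_{xx}(0) & R_{xx}(2) & R_{xx}(4) & R_{xx}(5) \\
R_{xx}(4) & R_{xx}(2) & R_{xx}(0) & R_{xx}(2) & R_{xx}(3) \\
R_{xx}(6) & R_{xx}(4) & R_{xx}(2) & R_{xx}(0) & R_{xx}(1) \\
R_{xx}(7) & R_{xx}(5) & R_{xx}(3) & R_{xx}(1) & R_{xx}(0)
\end{bmatrix}
\begin{bmatrix} b_1 \\ b_3 \\ b_5 \\ b_7 \\ b_8 \end{bmatrix}
=
\begin{bmatrix} R_{xx}(1) \\ R_{xx}(3) \\ R_{xx}(5) \\ R_{xx}(7) \\ R_{xx}(8) \end{bmatrix} \tag{32}$$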

If a longer range, slightly less accurate prediction is desired, $x_1$ could be
discarded, or perhaps both $x_1$ and $x_3$.

The control procedure could possibly be implemented in one manner
by utilizing a modern day shipboard high speed digital computer such as the
AN/UYK-1 (TRW-130). The system would require some sort of device to
sample the roll time series x(t) at least once per second and probably more
like ten times per second. This information would have to be converted from
analog to digital form and then fed into the computer. The regression equation,
such as Eq. (31), would then be evaluated and a predicted value $\hat{x}_0$
obtained. This value could then be processed appropriately, output perhaps
to  a  digital  to  analog  converter,  and  then  used  by  some  anti-roll  device.  All 
these  procedures  would  be  accomplished  in  "real  time."  Perhaps  every  few 
seconds  a  new  correlation  function  estimate  could  be  developed  and  its  values 
tested  for  significance  by  some  simple  procedure  to  determine  whether  or 
not  to  use  a  point  in  the  prediction  equation. 

A  matrix  of  values  such  as  given  in  Eq.  (32)  would  have  to  be  inverted 
in  order  to  obtain  the  coefficients  for  the  prediction  equation  which  would 
then  replace  the  previously  used  prediction  equation.  This  matrix  would, 
of  necessity,  be  restricted  to  some  maximum  size  so  that  the  inversion 
could be performed in a reasonable amount of time. Evaluating the regression
Eq. (31) would prove no problem timewise since, for example, the add time
of the AN/UYK-1 is 12 microseconds (µs) for a 15-bit word and its multiply
time is 57 µs maximum. Therefore, the necessary instructions to evaluate
Eq. (31) would require about 500 µs, allowing for the necessary load and
store operations.
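
The timing estimate can be checked with a few lines of arithmetic, shown here as a small Python sketch. It assumes that a five-term regression equation needs five multiplications and four additions, takes the quoted add and multiply times at face value, and treats the remainder of the quoted 500 microsecond figure as load and store overhead. The variable names and the ten-per-second sampling rate are illustrative assumptions.

```python
ADD_US = 12          # add time quoted for the AN/UYK-1, microseconds
MULTIPLY_US = 57     # maximum multiply time quoted, microseconds
BUDGET_US = 500      # total evaluation time estimated in the text

def arithmetic_time_us(n_terms):
    """Worst-case arithmetic time for an n-term regression equation:
    n multiplications and n - 1 additions."""
    return n_terms * MULTIPLY_US + (n_terms - 1) * ADD_US

arithmetic_us = arithmetic_time_us(5)      # 5 * 57 + 4 * 12 = 333
spare_us = BUDGET_US - arithmetic_us       # 167 left for load and store

# Fraction of the sampling interval used at ten samples per second.
interval_us = 1_000_000 // 10
load_fraction = BUDGET_US / interval_us    # 0.005
```

Under these assumptions, evaluating the prediction equation would occupy about half of one percent of each sampling interval, leaving nearly all of the machine time for the correlation and matrix computations.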

The remainder of the problem, that is, evaluating the correlation
function, inverting the correlation matrix, and input/output functions, could
be performed on a piecemeal basis at a slower rate. Military computers
such as the AN/UYK-1 usually have interrupt capabilities such that when an
input or output device is ready, it can send a signal to interrupt the computer
processing. Therefore, if it was time for the anti-roll device to receive
information, it could interrupt, say, the matrix inversion routine. The
computer could perform the necessary processing to output information,
and then return to the matrix inversion. This type of processing would
possibly allow for an output rate of five or ten control signals per second.
A block diagram for this type of system is illustrated in Figure 3.


Figure 3. Possible Digital Control System for Ship's Roll


In order to efficiently perform all the required processing, it would
probably be necessary to evaluate certain quantities, such as the correlation
function, in a recursive manner. For example, with the first observation,
$x_i$, one computes $x_i^2$. With the second observation, $x_{i+1}$, one
computes $x_{i+1}^2$ and $x_i x_{i+1}$, and accumulates $x_i^2 + x_{i+1}^2$.
With the third observation, $x_{i+2}$, one computes $x_{i+2}^2$,
$x_{i+2} x_i$, $x_{i+2} x_{i+1}$, and accumulates
$x_i^2 + x_{i+1}^2 + x_{i+2}^2$ and $x_i x_{i+1} + x_{i+1} x_{i+2}$.
This procedure would then continue until sufficient observations had been
obtained to allow reliable correlation function estimates. Perhaps more
efficient approximate procedures could be developed.
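The recursive accumulation described above might be sketched as follows. This is only an illustration under stated assumptions: the class name `LaggedProductAccumulator` and the parameter `max_lag` are hypothetical, the record is assumed to have zero mean so that raw lagged products estimate the correlation function, and sums are kept for every lag up to `max_lag`.

```python
from collections import deque


class LaggedProductAccumulator:
    """Running sums of x[t] * x[t - lag] for lag = 0 .. max_lag (hypothetical helper)."""

    def __init__(self, max_lag):
        self.max_lag = max_lag
        self.sums = [0.0] * (max_lag + 1)        # sums[lag] = sum of x[t] * x[t - lag]
        self.counts = [0] * (max_lag + 1)        # number of products in each sum
        self.recent = deque(maxlen=max_lag + 1)  # newest observation first

    def update(self, x):
        """Fold one new observation into every lag sum it can contribute to."""
        self.recent.appendleft(x)
        for lag, past in enumerate(self.recent):
            self.sums[lag] += x * past
            self.counts[lag] += 1

    def estimates(self):
        """Mean lagged products (None where no product has been formed yet)."""
        return [s / n if n else None for s, n in zip(self.sums, self.counts)]


acc = LaggedProductAccumulator(max_lag=2)
for value in [1.0, 2.0, 3.0]:   # three hypothetical observations
    acc.update(value)
print(acc.sums)
```

Each new observation costs only `max_lag + 1` multiply-add operations, which is the property that makes the piecemeal, interrupt-driven processing described earlier plausible.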

The prediction procedure might even be improved by including other
information, for example, by cross-correlating directly with a record of
ocean wave amplitudes. Another possibility would be to include a
cross-correlation with pitch or heave information. Although, in practice, the
six degrees of freedom of a ship's motion are usually assumed to be
independent, in actuality the motions may be correlated, and use of this
information might allow for better predictions of motion.


6.3  OCEAN  WAVE  AMPLITUDE  PREDICTION 


As  a  third  example  for  oceanographic  problems,  suppose  that  a  time 
series  x(t)  consists  of  observations  of  the  height  of  ocean  waves  at  a  given 
point  on  the  surface  of  the  sea.  Assume  that  observations  are  taken  at  one- 
second  intervals.  The  first  step  in  the  analysis  would  be  to  compute  the 
points  of  the  autocorrelation  function  at,  say,  ten-second  intervals  as 
defined  by  Eq.  (14).  The  correlation  function  could  be  expected  to  fall  off 
fairly  rapidly  and  then  a  peak  should  be  encountered  corresponding  to  the 
predominant  frequency  of  the  wave  process.  Assume  that  this  occurs 
at  a  delay  of  30  seconds.  It  is  not  unreasonable  that  another  underlying 
periodicity  might  occur  of  much  greater  period.  This  fact  would  be 
exhibited by a peak in the correlation function at a greater time delay, say,
at 150 seconds. The over-all normalized correlation function
for this data might then appear something like that pictured in Figure 4.


Figure 4

Hypothetical Autocorrelation Function for Ocean Wave Data


For this example one would choose as variables in the regression
equation $x_{10}$, $x_{30}$, and finally $x_{150}$. These variables correspond
to the points R(10), R(30), and R(150) of the correlation function. One then
wants to estimate the coefficients in the following linear equation.


$$\hat{x}_0 = b_{10} x_{10} + b_{30} x_{30} + b_{150} x_{150} \qquad (33)$$


The  set  of  simultaneous  linear  equations  to  be  solved  would  be  as  follows. 


$$\begin{aligned}
b_{10} R(0) + b_{30} R(20) + b_{150} R(140) &= R(10) \\
b_{10} R(20) + b_{30} R(0) + b_{150} R(120) &= R(30) \\
b_{10} R(140) + b_{30} R(120) + b_{150} R(0) &= R(150)
\end{aligned} \qquad (34)$$
Once the coefficients $b_{10}$, $b_{30}$, and $b_{150}$ have been computed,
Eq. (33) may be used to predict, on the basis of presently observed and
past observed data, for a period of ten seconds into the future. A prediction
of ten seconds into the future may or may not be of practical value as far
as ocean waves are concerned. However, in this example, due to the strong
correlations exhibited at both 30 seconds and 150 seconds, it might be
desirable and of interest to compute regression equations based on $x_{30}$ and
$x_{150}$, or possibly even just on $x_{150}$ alone. In the first case, one could
then be predicting ahead 30 seconds into the future, and in the second case one
could be predicting ahead 150 seconds into the future. These predictions
might be of more practical value. However, one loses precision in the
prediction when pertinent data is neglected, such as exists at $x_{10}$, which is
indicated by the strong correlation at the point i = 10. The over-all
correlation-regression analysis described in this example has the value of
pointing out the fact that in addition to the basic "periodicity" with a period
of 30 seconds, there is an additional underlying "periodicity" with a period
of approximately 150 seconds. This information may or may not have been
obvious from the original data.
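As a rough numerical illustration of solving the three simultaneous equations of this example, the sketch below invents a correlation function with 30-second and 150-second periodicities and solves for the three coefficients. The function `autocorr`, its damping constants, and the helper `solve` are assumptions made for illustration only; they do not come from measured ocean wave data.

```python
import math


def autocorr(lag):
    """Hypothetical normalized autocorrelation with 30 s and 150 s periodicities."""
    fast = 0.6 * math.exp(-abs(lag) / 60.0) * math.cos(2 * math.pi * lag / 30.0)
    slow = 0.4 * math.exp(-abs(lag) / 600.0) * math.cos(2 * math.pi * lag / 150.0)
    return fast + slow


def solve(a, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


lags = [10, 30, 150]
# Element (i, j) is R(|lag_i - lag_j|): R(0), R(20), R(140), R(120) as in the text.
matrix = [[autocorr(li - lj) for lj in lags] for li in lags]
rhs = [autocorr(lag) for lag in lags]
b10, b30, b150 = solve(matrix, rhs)
print(b10, b30, b150)
```

Dropping a lag from `lags` gives the reduced regressions mentioned above, for instance `[30, 150]` for a 30-second prediction.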

6.4 VIBRATION DATA APPLICATION

For a vibration data application, the emphasis of the procedure will
be shifted. A problem that is of interest is to obtain a vibration data time
series as a function of other time series. For example, pressure transducers
might be mounted at various external points on a ship's structure
and  an  accelerometer  might  be  located  at  an  internal  point  of  interest  on 
the  structure  to  measure  the  vibration  response  at  that  point.  The  pressure 
transducers  at  various  points  on  the  structure  would  effectively  measure 
sources  of  vibration  excitation.  These  exciting  forces  would  transmit 
directly  through  the  structure,  or  acoustically  through  the  air  to  produce 
vibration at the structural point where the accelerometer is located. There
would be time delays between the excitation and response due to the finite
amount  of  time  that  it  takes  to  transmit  the  vibration  through  the  structure 
or  the  surrounding  medium.  Therefore,  the  vibration  response  at  the 
accelerometer  might  be  obtained  as  a  function  of  the  response  measured 
at  the  various  pressure  transducers  at  some  time  in  the  past.  The  vibration 
response  would  then  be  given  as  a  function  of  lagged  values  of  the  pressure 
variables.  In  this  example  only  the  cross-correlation  analysis  would  be  of 
interest. 

For purposes of this example, let x(t) represent readings taken from
an  accelerometer  located  at  a  point  on  a  ship  structure,  and  let  y(t)  and 
z(t)  represent  the  readings  of  two  pressure  transducers  located  at  other 
points  on  the  structure.  The  normalized  cross-correlation  functions  of 
x(t)  with  y(t)  and  of  x(t)  with  z(t)  might  then  appear  as  illustrated  in 
Figure  5. 


Figure  5 

Hypothetical  Cross-correlation  Functions 
for  Vibration  Response  Example 


Variables to choose in predicting vibration as indicated by the above
accelerometer data, then, would be pressure transducer No. 1 readings
at lags of 8 and 10 time units, and pressure transducer No. 2 at a lag of
6 time units. Hence, the variables would be $y_8$, $y_{10}$, and $z_6$. The
regression equation to give accelerometer readings as a function of the
two pressure transducer readings is then


$$\hat{x}_0 = c_8 y_8 + c_{10} y_{10} + d_6 z_6 \qquad (35)$$


The  coefficients  of  this  equation  may  be  obtained  from  the  following  set  of 
linear  equations 
In this problem one needs the autocorrelation function for y(t) and
the autocorrelation function for z(t), the cross-correlation functions
between x(t) and y(t) and between x(t) and z(t), and the cross-correlation
function between y(t) and z(t) at negative values of the lag, or equivalently
values of the cross-correlation function between z(t) and y(t) at positive
values of the lag.

For  this  problem  the  end  objective  is  not  an  extrapolation  of  the  vibration 
time  series,  but  rather  to  predict  the  vibration  as  a  linear  function  of  the 
pressure  transducer  readings. 
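A minimal sketch of this lagged regression on synthetic data follows. Everything in it is assumed for illustration: the records `y` and `z` are random numbers, the "true" coefficients 0.5, 0.3, and 0.2 are invented, and the accelerometer record is built without noise so that the fitted coefficients can be checked. The normal equations are formed from sample auto- and cross-products of the lagged pressure readings.

```python
import random


def solve(a, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


random.seed(0)
n = 400
y = [random.gauss(0.0, 1.0) for _ in range(n)]   # pressure transducer No. 1 (synthetic)
z = [random.gauss(0.0, 1.0) for _ in range(n)]   # pressure transducer No. 2 (synthetic)

start = 10                                       # largest lag used
x = [0.0] * n                                    # accelerometer record (synthetic)
for t in range(start, n):
    x[t] = 0.5 * y[t - 8] + 0.3 * y[t - 10] + 0.2 * z[t - 6]


def mean_product(u, lag_u, v, lag_v):
    """Average of u[t - lag_u] * v[t - lag_v] over the usable samples."""
    terms = [u[t - lag_u] * v[t - lag_v] for t in range(start, n)]
    return sum(terms) / len(terms)


predictors = [(y, 8), (y, 10), (z, 6)]
# Left side: products among the lagged predictors; right side: products with x.
lhs = [[mean_product(u, lu, v, lv) for (v, lv) in predictors] for (u, lu) in predictors]
rhs = [mean_product(x, 0, u, lu) for (u, lu) in predictors]
c8, c10, d6 = solve(lhs, rhs)
print(c8, c10, d6)
```

With real data the residual would not vanish, and the lags themselves would be chosen by inspecting the cross-correlation functions as described above.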
7. STATISTICAL TESTS AND ERRORS


After  the  coefficients  in  the  regression  equation  have  been  obtained, 
it is desirable to apply a statistical test to see if the coefficients are
significantly different from zero. This is equivalent to testing if the variable
associated  with  the  given  coefficient  contributes  a  statistically  significant 
amount  to  the  prediction  of  the  time  series.  For  classical  regression 
analysis,  the  variables  are  assumed  to  be  normally  distributed  and  the 
deviations  from  the  predicted  values  used  in  the  regression  equations  are 
assumed  to  be  normally  distributed  and  independent  from  one  prediction  to 
the  next.  However,  in  the  application  of  regression  techniques  to  time 
series,  the  problem  is  more  difficult.  Even  when  the  process  may  be 
assumed  to  be  a  Gaussian  or  normal  process,  it  will  still  have  a  non-zero 
autocorrelation  function.  Hence,  the  residuals  from  the  prediction  will  not 
necessarily  be  independent  from  prediction  to  prediction,  but  will  themselves 
be correlated. Fortunately, for large sample sizes, it is pointed out in Ref. [2]
that the classical formulas hold approximately true. This means that classical
formulas  for  the  standard  errors  and  the  sampling  distributions  of  the 
regression  coefficients  are  asymptotically  valid  even  if  the  residuals  are 
correlated. 

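An approximate test for significance on the sample regression
coefficients may be performed in the following way. See Reference [3] for
more details. Let $\Gamma$ represent the (k + 1) by (k + 1) matrix of correlation
coefficients.

$$\Gamma = \begin{bmatrix}
1 & r(1) & \cdots & r(k) \\
r(1) & 1 & \cdots & r(k-1) \\
\vdots & \vdots & \ddots & \vdots \\
r(k) & r(k-1) & \cdots & 1
\end{bmatrix} \qquad (37)$$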


Then, if $C_{ij}$ is the cofactor of the (i, j)th element in the matrix $\Gamma$
of Eq. (37), the sample regression coefficient $b_i$ may be computed from

$$b_i = -\frac{C_{1,i+1}}{C_{11}} \qquad (38)$$


The sample variance of residual errors in the regression, i.e., the
variance of the distribution of the deviations of the predicted values from
the true values, is obtained from the formula

$$s_{0.12 \ldots k}^2 = R(0)\,\frac{|\Gamma|}{C_{11}} \qquad (39)$$


where $|\Gamma|$ is the determinant of the matrix $\Gamma$. The standard errors of the
regression coefficients are given by the formula

$$s_{b_i} = \left[ \frac{C_{11} C_{i+1,i+1} - C_{1,i+1}^2}{N\, C_{11}^2} \right]^{1/2} \qquad (40)$$


This is derived from manipulation of the formulas

$$s_{b_i}^2 = \frac{1}{N}\,\frac{s_{0.34 \ldots k}^2 \left(1 - \rho_{0i.23 \ldots k}^2\right)}{s_{i.34 \ldots k}^2} \qquad (41)$$

and

$$\rho_{0i.23 \ldots k} = -\frac{C_{1,i+1}}{\sqrt{C_{11}\, C_{i+1,i+1}}} \qquad (42)$$

which may be found in Chapter 27 of Reference [3].

The meaning of the above terms is as follows. The term $s_{0.12 \ldots k}^2$
in Eq. (39) is the sample variance of $x_0$ when the best linear estimates
of $x_0$ in terms of $x_1, x_2, \ldots, x_k$ have been subtracted out. This is the
reason for the form of the notation. The term $\rho_{0i.23 \ldots k}$ in Eq. (42) is
the partial correlation coefficient between $x_0$ and $x_i$. The partial correlation
coefficient is the correlation coefficient between the variables $x_0$ and $x_i$
after the best linear estimates in terms of the other variables involved have
been subtracted from $x_0$ and $x_i$.

Since the t distribution approaches the normal distribution for a
large number of degrees of freedom, the normal distribution may be
substituted for t in these tests if (N - k) is larger than, say, 30. This
should be the case in most practical situations. Chapter 3 of Reference [4]
gives correction factors for the standard errors of the regression coefficients,
but these have the disadvantage of being quite complicated and requiring a large
amount of additional computation.

It may then be shown, Ref. [3], that the statistic


has a t-distribution with (N - k) degrees of freedom (d.f.). Thus, one
computes t from Eq. (43), and if

$$|t| \geq t_{1-\alpha/2}(N-k) \qquad (44)$$

where $t_{1-\alpha/2}(N-k)$ is obtained from tables of the Student t distribution,
then the hypothesis $b_i = \beta_i$ is rejected at the $\alpha$ level of significance. Note
this is a "two-tailed" test. The most usual test would be for $b_i = 0$, and
Eq. (43) would be modified accordingly by setting $\beta_i = 0$.
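The test might be sketched as below. Several things here are assumptions rather than statements of the report: the statistic is taken to be the coefficient divided by its standard error (the usual form when $\beta_i = 0$), the cofactor expressions are the ones read from Eqs. (38) and (40), and the normal distribution stands in for t on the grounds that (N - k) exceeds 30. The function names are hypothetical.

```python
from statistics import NormalDist


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small k)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, value in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * det(minor)
    return total


def cofactor(m, i, j):
    """Signed cofactor of element (i, j), zero-based indices."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)


def significance(r, n_obs, alpha=0.05):
    """r = [1, r(1), ..., r(k)]; returns (b_i, s_b_i, t_i, reject) for each lag i."""
    k = len(r) - 1
    gamma = [[r[abs(i - j)] for j in range(k + 1)] for i in range(k + 1)]
    c11 = cofactor(gamma, 0, 0)
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # normal stand-in for t
    results = []
    for i in range(1, k + 1):
        c1i = cofactor(gamma, 0, i)      # C_{1,i+1} in one-based notation
        cii = cofactor(gamma, i, i)      # C_{i+1,i+1} in one-based notation
        b = -c1i / c11
        s_b = ((c11 * cii - c1i ** 2) / (n_obs * c11 ** 2)) ** 0.5
        t = b / s_b                      # assumed statistic for beta_i = 0
        results.append((b, s_b, t, abs(t) >= critical))
    return results


# Hypothetical correlation coefficients r(0), r(1), r(2) and sample size N = 100.
for b, s_b, t, reject in significance([1.0, 0.5, 0.25], n_obs=100):
    print(round(b, 4), round(s_b, 4), round(t, 2), reject)
```

For these invented values the first coefficient is retained and the second is not, which is the kind of screening decision the test is meant to support.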
8. DIGITAL COMPUTING CONSIDERATIONS


Some basic computing procedures for regression analysis are indicated
in Section 5.6 of Reference [1]. An outline will now be given of suggested
computations to implement the preceding discussions, which encompass those
techniques discussed in Reference [1]. No distinction need usually be made
between the auto- and cross-correlation cases since the computational
procedures are basically the same.

A convenient way of handling the set of (N + k) observations of $x_i$ is to
arrange them in a (k + 1) by N matrix as indicated below.


The product of this matrix X with its transpose X' gives a (k + 1) by
(k + 1) matrix whose elements are the discrete points of the sample correlation
function. Thus


(46) 


The top row of the matrix then gives the k + 1 discrete points of the
autocorrelation function R(0), R(1), ..., R(k). In actual computational practice
for the autocorrelation case, one only needs to compute the first row of the
matrix (46) and then shift this row to the right to obtain the additional portions
of the (k - 1) rows necessary to fill in the upper right half of this symmetric
matrix. However, in some cases where computing time is not important,
one might perform the complete matrix multiplication to simplify programming.
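The row-shifting shortcut can be illustrated with a short sketch. The function name `correlation_matrix` and the numerical values are hypothetical; the only assumption carried over from the text is that element (i, j) of the matrix depends on the lag difference alone.

```python
def correlation_matrix(first_row):
    """Build the symmetric matrix whose (i, j) element is R(|i - j|)."""
    size = len(first_row)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        # Row i is the first row shifted i places to the right ...
        for j in range(i, size):
            matrix[i][j] = first_row[j - i]
            matrix[j][i] = first_row[j - i]   # ... mirrored into the lower half.
    return matrix


R = [4.0, 2.0, 1.0, 0.5]     # hypothetical R(0), R(1), R(2), R(3)
M = correlation_matrix(R)
for row in M:
    print(row)
```

Only the k + 1 values in the first row require sums of products over the data; the rest of the matrix is filled by copying.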

The values of the correlation function R(i) would then be inspected
to determine what values of $x_i$ to utilize in the regression equation. The
rows and columns corresponding to the unwanted points would then be
deleted from the general matrix equation.

For  example,  to  obtain  Eqs.  (32)  of  Section  6.2,  one  eliminates  as 
indicated  below. 


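$$\begin{bmatrix}
R(0) & R(1) & R(2) & R(3) & R(4) & R(5) & \cdots \\
R(1) & R(0) & R(1) & R(2) & R(3) & R(4) & \cdots \\
R(2) & R(1) & R(0) & R(1) & R(2) & R(3) & \cdots \\
R(3) & R(2) & R(1) & R(0) & R(1) & R(2) & \cdots \\
R(4) & R(3) & R(2) & R(1) & R(0) & R(1) & \cdots \\
R(5) & R(4) & R(3) & R(2) & R(1) & R(0) & \cdots \\
\vdots & & & & & & \ddots
\end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ \vdots \end{bmatrix}
=
\begin{bmatrix} R(1) \\ R(2) \\ R(3) \\ R(4) \\ R(5) \\ R(6) \\ \vdots \end{bmatrix}
\qquad (47)$$

Here the second, fourth, and sixth rows and columns of the matrix, together
with the corresponding elements of the two column vectors, are the ones
struck out.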


The set of linear equations remaining after the appropriate deletions then
would correspond to Eqs. (32) in Section 6.2. The reduced system of equations
indicated by Eq. (47) may now be solved directly for the $b_i$ coefficients, or
in some cases it is desirable to obtain the explicit form of the inverse of the
matrix of correlation function values. To avoid the necessity of introducing
new symbols, let $R_x$ represent the matrix of correlation function values
whether or not deletions have been made.
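The deletion of unwanted rows and columns amounts to indexing the correlation function at differences of the retained lags, as the following sketch suggests. The helper `reduce_system`, its argument `keep_lags`, and the numerical values are assumptions made for illustration.

```python
def reduce_system(R, keep_lags):
    """Keep only the rows and columns of the full system that belong to keep_lags.

    R is the list [R(0), R(1), ..., R(k)]; keep_lags holds the lags i whose
    coefficients b_i are retained in the regression equation.
    """
    matrix = [[R[abs(i - j)] for j in keep_lags] for i in keep_lags]
    rhs = [R[i] for i in keep_lags]
    return matrix, rhs


R = [1.0, 0.8, 0.5, 0.3, 0.1, 0.05]     # hypothetical R(0) ... R(5)
matrix, rhs = reduce_system(R, keep_lags=[1, 3, 5])
for row, target in zip(matrix, rhs):
    print(row, "=", target)
```

The same indexing reproduces the pattern of the ocean wave example, where lags of 10, 30, and 150 lead to matrix elements R(20), R(120), and R(140).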
For digital computing purposes, it is usually desirable to work with
the matrix of correlation coefficients defined by

$$r(i) = \frac{R(i)}{R(0)} = \frac{R(i)}{\sigma_x^2} \qquad (48)$$


for the autocorrelation case and by

$$r_{xy}(i) = \frac{R_{xy}(i)}{\sqrt{R_{xx}(0)\, R_{yy}(0)}} = \frac{R_{xy}(i)}{\sigma_x \sigma_y} \qquad (49)$$


for the cross-correlation case. This normalization requires divisions to
be performed in the autocorrelation case and square roots and divisions in
the cross-correlation case. However, the final quantities are then in the
range -1 to +1, which provides advantages in the general handling of the
numbers in the matrix inversion process. Let $\Gamma_x$ represent the matrix
of correlation coefficients and let A represent the inverse of $\Gamma_x$, that is,
$A = \Gamma_x^{-1}$.
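The two normalizations might be written as follows. This is a sketch only; the function names and the numerical values are hypothetical.

```python
import math


def normalize_auto(R):
    """r(i) = R(i) / R(0) for an autocorrelation sequence R(0), R(1), ..."""
    return [value / R[0] for value in R]


def normalize_cross(R_xy, R_xx0, R_yy0):
    """r_xy(i) = R_xy(i) / sqrt(R_xx(0) * R_yy(0))."""
    scale = math.sqrt(R_xx0 * R_yy0)
    return [value / scale for value in R_xy]


r = normalize_auto([4.0, 2.0, -1.0])             # hypothetical R(0), R(1), R(2)
r_xy = normalize_cross([3.0, -6.0], 4.0, 9.0)    # hypothetical R_xy values
print(r, r_xy)
```

The square root in the cross-correlation case is computed once per pair of records, not once per lag.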

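The system of equations to be solved directly for the coefficients is
then

$$\begin{bmatrix}
1 & r(1) & \cdots & r(k-1) \\
r(1) & 1 & \cdots & r(k-2) \\
\vdots & \vdots & \ddots & \vdots \\
r(k-1) & r(k-2) & \cdots & 1
\end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_k \end{bmatrix}
=
\begin{bmatrix} r(1) \\ r(2) \\ \vdots \\ r(k) \end{bmatrix}
\qquad (50)$$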
Note that the matrix on the left is a k by k matrix corresponding to that of
Eq. (37) with the first row and column deleted. If the computations are
being performed in a real time control system and computational speed is
important, one would solve for the $b_i$ directly without obtaining the inverse
matrix explicitly. However, in other situations, it is desirable to obtain
the inverse of the (k + 1) by (k + 1) matrix explicitly since certain quantities
of statistical interest may be conveniently obtained from the elements of A.

For the solution of the equations, a reasonable elimination and back
substitution method which takes advantage of symmetry is termed the
"Banachiewicz-Cholesky-Crout Method." A description of this method, Ref. [8],
is presented here. For the computational procedure below, let $d_{ij}$ represent
the elements of the k by k matrix of Eq. (50). The computational steps
are as follows:


(a) Define

$$\gamma_{i1} = d_{i1}, \quad i = 1, 2, \ldots, k$$

$$\alpha_{1j} = d_{1j}, \quad j = 2, 3, \ldots, k$$
i, j = 2, 3, ..., k (51)

(52)

(d)  Compute 

=  r(i) 
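Because the intermediate steps are only partly legible here, the following sketch uses the textbook Crout factorization, not necessarily the report's exact equations. The matrix d is factored into a lower triangular matrix `gamma` and a unit upper triangular matrix `alpha` (symbols chosen to echo step (a)), followed by forward and back substitution. The function name `crout_solve` and the numerical values are hypothetical, and the sketch does not exploit the symmetry of d.

```python
def crout_solve(d, rhs):
    """Solve d * b = rhs by Crout factorization d = gamma * alpha.

    gamma is lower triangular; alpha is upper triangular with a unit diagonal.
    No pivoting is done, which is adequate for a positive definite matrix.
    """
    k = len(rhs)
    gamma = [[0.0] * k for _ in range(k)]
    alpha = [[float(i == j) for j in range(k)] for i in range(k)]
    for j in range(k):
        # Column j of gamma (rows j .. k-1); the first column equals d's first column.
        for i in range(j, k):
            gamma[i][j] = d[i][j] - sum(gamma[i][n] * alpha[n][j] for n in range(j))
        # Row j of alpha (columns j+1 .. k-1).
        for c in range(j + 1, k):
            partial = sum(gamma[j][n] * alpha[n][c] for n in range(j))
            alpha[j][c] = (d[j][c] - partial) / gamma[j][j]
    # Forward substitution: gamma * omega = rhs.
    omega = [0.0] * k
    for i in range(k):
        omega[i] = (rhs[i] - sum(gamma[i][n] * omega[n] for n in range(i))) / gamma[i][i]
    # Back substitution: alpha * b = omega.
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = omega[i] - sum(alpha[i][n] * b[n] for n in range(i + 1, k))
    return b


r = [1.0, 0.5, 0.25]                                    # hypothetical r(0), r(1), r(2)
d = [[r[abs(i - j)] for j in range(2)] for i in range(2)]
coefficients = crout_solve(d, [r[1], r[2]])
print(coefficients)
```

For a symmetric matrix the elements of `alpha` are the transposed elements of `gamma` divided by the diagonal, so roughly half of the factorization arithmetic could be saved in a production routine.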
If the elements $a_{ij}$ of the inverse matrix A are desired, the same
basic procedure given above applies with minor modifications, except now
one starts with the full (k + 1) by (k + 1) matrix of correlation coefficients.
In Step (d) above, the r(i) are replaced with unit row vectors $\varepsilon_i$, where
$\varepsilon_i$ represents a unit vector with a one in the ith position. For example,
$\varepsilon_1 = (1, 0, \ldots, 0)$ replaces r(1), etc. Steps (d) and (e) are modified as
follows:


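(d') Compute

$$\omega_1 = \frac{\varepsilon_1}{\gamma_{11}}$$

$$\omega_i = \frac{1}{\gamma_{ii}} \left[ \varepsilon_i - \sum_{n=1}^{i-1} \gamma_{in}\, \omega_n \right], \quad i = 2, 3, \ldots, k+1 \qquad (55)$$

Note that the $\omega_i$ are now row vectors instead of scalars.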


(e') Compute the rows $a_i$ of the inverse matrix A

$$a_{k+1} = \omega_{k+1}$$

$$a_i = \omega_i - \sum_{n=i+1}^{k+1} \alpha_{in}\, a_n, \quad i = k, k-1, \ldots, 1 \qquad (56)$$


(f)  The  regression  coefficients  are  given  by 


(57) 


In Eq. (57), $s_i$ is the sample standard deviation of the ith variable under
consideration. In the strict autocorrelation case $s_1 = s_2 = \cdots = s_k$, but in
the cross-correlation case they are in general different.